Lets keep it simple: using simple architectures to outperform deeper architectures

Authors

  • Seyyed Hossein HasanPour
  • Mohammad Rouhani
  • Javad Vahidi
Abstract

In recent years, all winning architectures that achieved state-of-the-art results have been very deep, with parameter counts ranging from tens to hundreds of millions. While an optimal depth has yet to be discovered, it is known that these deep architectures are far from optimal. Very deep and heavy architectures such as VGGNet, GoogleNet, ResNet and the like are very demanding in terms of hardware requirements, so their practical use has become very limited and costly. The computational and memory overhead caused by such architectures has a negative effect on the expansion of methods and applications that rely on deep architectures. To overcome these issues, thinned architectures such as SqueezeNet have been proposed that are computationally lightweight and useful for embedded systems. However, their usage is also hindered by the low accuracy they provide. While deep architectures do provide good accuracy, we empirically show that a well-crafted yet simple and reasonably deep architecture can perform equally well. This allows for more practical uses, especially in embedded systems or systems with computational and memory limitations. In this work, we present a very simple fully convolutional network with 13 layers that outperforms almost all deeper architectures to date, such as ResNet, GoogleNet, WRN, etc., with 2 to 25 times fewer parameters; in the rare cases where it does not surpass an architecture, it performs on par. We achieved state-of-the-art or near state-of-the-art results on datasets such as CIFAR10/100, MNIST and SVHN with simple or no data augmentation.
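To make the parameter comparison in the abstract concrete, the following minimal sketch counts the weights of a plain stack of 13 convolutional layers. The layer widths in `HYPOTHETICAL_WIDTHS`, the 3x3 kernels, and the helper names `conv_params` and `stack_params` are assumptions chosen for illustration only; they are not the configuration reported by the authors.

```python
def conv_params(in_ch, out_ch, kernel=3, bias=True):
    """Number of learnable parameters in one 2D convolution layer."""
    weights = kernel * kernel * in_ch * out_ch
    return weights + (out_ch if bias else 0)


def stack_params(widths, in_ch=3, kernel=3):
    """Total parameters of a plain stack of convolutions applied in sequence."""
    total = 0
    for out_ch in widths:
        total += conv_params(in_ch, out_ch, kernel)
        in_ch = out_ch  # the output of one layer feeds the next
    return total


# Hypothetical widths for 13 layers (an assumption, not the paper's values).
HYPOTHETICAL_WIDTHS = [64] + [128] * 5 + [256] * 7

if __name__ == "__main__":
    total = stack_params(HYPOTHETICAL_WIDTHS)
    print(f"{len(HYPOTHETICAL_WIDTHS)} layers, {total:,} parameters")
```

Because the cost of each layer grows with the product of its input and output channel counts, a modest change in width moves the total far more than adding one more narrow layer, which is why a carefully sized 13-layer network can stay well below the budgets of much deeper models.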

Similar resources

An analytical performance model of FPGAs for reconfigurable computing

Optimizing FPGA architectures is one of the key challenges in the digital design flow. Traditionally, FPGA designers make use of CAD tools for evaluating architectures in terms of area, delay and power. Recently, analytical methods have been proposed to optimize the architectures faster and more easily. A complete analytical power, area and delay model has received little attention to date. In addi...

Comparing Shallow versus Deep Neural Network Architectures for Automatic Music Genre Classification

In this paper we investigate performance differences of different neural network architectures on the task of automatic music genre classification. Comparative evaluations on four well known datasets of different sizes were performed including the application of two audio data augmentation methods. The results show that shallow network architectures are better suited for small datasets than dee...

Going Deeper in Spiking Neural Networks: VGG and Residual Architectures

Over the past few years, Spiking Neural Networks (SNNs) have become popular as a possible pathway to enable low-power event-driven neuromorphic hardware. However, their application in machine learning has largely been limited to very shallow neural network architectures for simple problems. In this paper, we propose a novel algorithmic technique for generating an SNN with a deep architecture, ...

CUHK&SIAT Submission for THUMOS15 Action Recognition Challenge

This paper presents the method of our submission for the THUMOS15 action recognition challenge. We propose a new action recognition system by exploiting very deep two-stream ConvNets and Fisher vector representation of iDT features. Specifically, we utilize those successful very deep architectures in images such as GoogLeNet and VGGNet to design the two-stream ConvNets. From our experiments, we see ...

Convolutional Sequence Modeling Revisited

Although both convolutional and recurrent architectures have a long history in sequence prediction, the current “default” mindset in much of the deep learning community is that generic sequence modeling is best handled using recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation....

Publication date: 2016